Operator Overloading

Operator overloading is a powerful feature. It lets you give your own meaning to the standard infix notation, i.e., writing a+b will be understood as a call to a function defined by you called ‘+’. This feature has many applications, among them array bounds checking and the implementation of new numeric types. It is used extensively in the C++ language, so many people should be familiar with it.

Usage

Operator overloading will be considered by the compiler as an option only for user defined types, i.e., structures. Please note that this makes it impossible to redefine the basic types, such as int or double.

The following typedef is assumed in the examples below:

typedef struct tagComplex {

double x;

double y;

} COMPLEX;

To define a new operator, proceed as follows:

COMPLEX operator +(COMPLEX left, COMPLEX right)

{

COMPLEX result;

result.x = left.x + right.x;

result.y = left.y + right.y;

return result;

}

Note several things here:

This declaration can happen only at the global level, like all function declarations.
This function returns a NEW object.
The input arguments to this function are pointers to objects, or as in the example, the objects themselves passed by value.
This declaration is equivalent to a function you would have written with a long name (see how this name is derived below). Each time the compiler finds a match for this function call within an addition, it will call this function.

Assume then that after this definition is seen by the compiler, you write the following code:

void example(void)

{

COMPLEX a = {2.0,0.0},b = {3.0,0.0},c;

c = a+b;

}

This instruction will be interpreted as:

c = _op_plus_COMPLEX_COMPLEX(a,b);

The name of the function is derived as follows:

It starts with the prefix _op_.

Followed by the name of the operator. The names are documented below.

Followed by the type name of each argument in order, each preceded by an underscore, so that the types are separated from the operator name and from each other.

If any spaces appear in a type name, they will be substituted by underscores. Thus ‘unsigned int’ will become ‘unsigned_int’.
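The derivation rules above can be sketched in standard C as a small string-building helper. This is only an illustration: derive_operator_name is a hypothetical function, not part of lcc-win32, and the compiler's real implementation may differ in details the text does not specify.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of lcc-win32: builds the function name
   described above from an operator name and the argument type names. */
void derive_operator_name(char *out, size_t outsize, const char *opname,
                          const char *const types[], size_t ntypes)
{
    assert(out != NULL && outsize > 0 && opname != NULL);

    /* Prefix "_op_" followed by the operator name, e.g. "_op_plus". */
    int written = snprintf(out, outsize, "_op_%s", opname);
    size_t len = (written < 0) ? 0 : (size_t)written;
    if (len >= outsize)
        len = outsize - 1;               /* snprintf truncated the output */

    for (size_t i = 0; i < ntypes; i++) {
        if (len + 1 < outsize)
            out[len++] = '_';            /* underscore before each type */
        for (const char *p = types[i]; *p != '\0' && len + 1 < outsize; p++)
            out[len++] = (*p == ' ') ? '_' : *p;   /* spaces become '_' */
    }
    out[len] = '\0';
}
```

Called with the operator name "plus" and the types "COMPLEX" and "unsigned int", this helper builds the string _op_plus_COMPLEX_unsigned_int.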

Rules

All operators should have at least one argument that is a user-defined structure.
Type conversions can be realized by defining different operators for different input arguments. You can write several operators +, each with different arguments.

Here is an example. In the above context, we implement the addition c = b + x, where x is a double precision value.[1]

COMPLEX operator +(const COMPLEX left, const double right)

{

COMPLEX result;

result.x = left.x + right;

result.y = left.y;

return result;

}

Then you can write in your code:

COMPLEX c = a + 5.9;

and the compiler will call the right operator for you. Note that the compiler-generated name for this operator will be _op_plus_COMPLEX_double, which differentiates it from the preceding example.
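To make the dispatch concrete, here is a sketch in standard C (without the operator extension) of what the two overloads amount to: two ordinary functions with distinct names, where the call is chosen by the argument types. The names below drop the leading underscore of the generated names and are assumptions made for this illustration only.

```c
#include <assert.h>

typedef struct tagComplex {
    double x;
    double y;
} COMPLEX;

/* Stand-in for: COMPLEX operator +(COMPLEX left, COMPLEX right) */
COMPLEX op_plus_COMPLEX_COMPLEX(COMPLEX left, COMPLEX right)
{
    COMPLEX result;
    result.x = left.x + right.x;
    result.y = left.y + right.y;
    return result;
}

/* Stand-in for: COMPLEX operator +(const COMPLEX left, const double right) */
COMPLEX op_plus_COMPLEX_double(COMPLEX left, double right)
{
    COMPLEX result;
    result.x = left.x + right;   /* only the first component changes */
    result.y = left.y;
    return result;
}

/* Mirrors "c = a + b;" and "c = a + 5.9;" written as explicit calls. */
void overload_demo(void)
{
    COMPLEX a = {2.0, 0.0}, b = {3.0, 0.0};

    COMPLEX c = op_plus_COMPLEX_COMPLEX(a, b);   /* c = a + b;   */
    assert(c.x == 5.0 && c.y == 0.0);

    c = op_plus_COMPLEX_double(a, 5.9);          /* c = a + 5.9; */
    assert(c.x > 7.89 && c.x < 7.91 && c.y == 0.0);
}
```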

Operator Arguments

In this table, arguments that are preceded by const should not be modified by the operator function. lcc-win32 does not yet verify this, but you should not rely on that.

The column titled ‘Pointer only args?’ refers to the arguments of the operator redefining function: can it take only pointers as arguments, or must it have at least one argument that is a structure or a reference to one?

In general, operations that are defined by the C language for pointers need at least one argument that is NOT of pointer type to avoid ambiguities in the language. When you write

char *p = "Hello";

p++;

should the compiler call the redefined operator ++ or just increment the pointer? There is no way to know, so an operator ++ should never have a pointer as its argument. On the other hand, division of pointers is not defined, so if the redefined operator ‘/’ uses pointers as arguments, no problems can arise. Note that this is much more general than C++, which prohibits an operator from being redefined when only pointers are used as arguments.

Name	Symbol	Arguments	Pointer only args?	Comment
plus	+	const left, const right	Yes	Binary addition. Should return a new object with the result of the operation. Arguments are const.
unary_plus	+	const right	Yes	Monadic plus. This is normally a no-op for numeric arguments, but can be used for any purpose you wish with your own structures. Should return a new object.
plusasgn	+=	left, const right	No	Should return its (possibly modified) left argument.
increment	++	right	No	Post increment. Should return the unmodified value of its argument. Normally, as a side effect, the argument can be modified.
minus	-	const left, const right	No	Binary subtraction. Same constraints as binary +
unary_minus	-	const right	Yes	Monadic minus. Should subtract its argument from zero, in the numeric cases. Should return a new object. Normally it shouldn’t modify its right argument.
minusasgn	-=	left, const right	No	Should return its possibly modified left argument
decrement	--	left	No	Post decrement. Should return the unmodified value of its argument. Normally, as a side effect, the argument should be modified.
multiply	*	const left, const right	Yes	Multiplication. Should return a new object with the result of the operation.
multasgn	*=	left, const right	Yes	Should return its (possibly modified) left argument.
and	&	const left, const right	Yes	Should return a new object.
xor	^	const left, const right	Yes	Should return a new object.
divide	/	const left, const right	Yes	Division. Should return a new object with the result of the operation.
divasgn	/=	left, const right	Yes	Should return its (possibly modified) left argument.
index	[]	left, right	No	Should return a reference to the element obtained by indexing the left argument with its right argument, which is always an integer expression.[2]
indexasgn	[]=	left, index, right	No	Should assign the new value given by right to the element of the left argument at the position given by index.
leftshift	<<	const left, const right	Yes	Should leave its right argument unmodified and return a (possibly modified) left argument.
rightshift	>>	const left, const right	Yes	Should leave its right argument unmodified and return a (possibly modified) left argument.
lshasgn	<<=	left, const right	Yes	Should return its (possibly modified) left argument.
rshasgn	>>=	left, const right	Yes	Should return its possibly modified left argument.
not	!	const right	Yes	Should return a new object. Normally it shouldn’t modify its right argument.
equal	==	const left, const right	No	Should return an integer other than zero if the comparison succeeds, otherwise zero. The arguments shouldn’t be modified.
lessequal	<=	const left, const right	No	Should return an integer other than zero if the comparison succeeds, otherwise zero. The arguments shouldn’t be modified.
greaterequal	>=	const left, const right	No	Should return an integer other than zero if the comparison succeeds, otherwise zero. The arguments shouldn’t be modified.
less	<	const left, const right	No	Should return an integer other than zero if the comparison succeeds, otherwise zero. The arguments shouldn’t be modified.
greater	>	const left, const right	No	Should return an integer other than zero if the comparison succeeds, otherwise zero. The arguments shouldn’t be modified.
notequal	!=	const left, const right	No	Should return an integer other than zero if the comparison fails, otherwise zero. The arguments shouldn’t be modified.
asgn	=	reference left, const right	Yes	Should return "left", that should be a reference to an object. If it is not declared as reference this operator will not work.

Differences to C++

In the C++ language, you can redefine the operators && (and), || (or), and , (comma). You cannot do this in C. The reasons are very simple.

In C (as in C++), logical expressions within conditional contexts are evaluated from left to right. If, in the context of the AND operator, the first expression returns a FALSE value, the others will NOT be evaluated. This means that once the truth or falsehood of an expression has been determined, evaluation of the expression ceases, even if some parts of the expression haven't yet been examined.

Now, if a user wanted to redefine the operator AND or the operator OR, the compiler would have to generate a function call to the user-defined function, giving it all the arguments of BOTH expressions. To make the function call, the compiler would have to evaluate them both, before passing them to the redefined operator&&.

Consequence: all expressions would be evaluated, and code that relies on the normal behavior of C would not work. The same reasoning applies to the operator OR: it evaluates its expressions from left to right, but stops at the first that returns TRUE.
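The difference can be observed in standard C. In this sketch, user_and is a hypothetical stand-in for a user-defined operator&&: because it is an ordinary function, both of its arguments are evaluated before the call, whereas the built-in && skips its second operand.

```c
#include <assert.h>

int calls_to_second = 0;   /* how many times second() has been evaluated */

int first(void)  { return 0; }                        /* always FALSE */
int second(void) { calls_to_second++; return 1; }     /* counts its calls */

/* Hypothetical stand-in for a redefined operator&&. */
int user_and(int left, int right)
{
    return left && right;
}

/* Built-in &&: first() is FALSE, so second() is never evaluated. */
int builtin_and(void)
{
    return first() && second();
}

/* Function call: both arguments are evaluated before user_and runs. */
int function_and(void)
{
    return user_and(first(), second());
}
```

Both versions yield zero, but only the function-call version runs second(), so any code that relies on the skipped evaluation (for instance a null-pointer test before a dereference) would break.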

A similar problem appears with the Comma operator, which evaluates in sequence all the expressions separated by the comma(s), and returns as the value of the expression the last result evaluated. When passing the arguments to the overloaded function, however, there is no guarantee that the order of evaluation will be from left to right. The C standard does not specify the order for evaluating function arguments. Therefore, this would not work.[3]

Another difference with C++ is that here you can redefine the operator []=, i.e., the assignment to an array is a different operation than the reference of an array member. The reason is simple: the C language always distinguishes between the operator + and the operator +=, the operator * is different from the operator *=, etc. There is no reason why the operator [] should be any different.

This simple fact allows you to do things that are quite impossible for C++ programmers: You can easily distinguish between the assignment and the reference of an array, i.e., you can specialize the operation for each usage. In C++ doing this implies creating a “proxy” object, i.e., a stand-in construct that senses when the program uses it for writing or reading and acts accordingly. This proxy must be defined, created, etc., and it has to redefine all operators to be able to function. In addition, this highly complex solution is not guaranteed to work! The proxies have subtly different behaviors in many situations because they are not the object they stand for.

You do not need all of these complexities in your software. You are in control. This is C.[4]
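As a rough illustration of why separating the two operations is useful, the following standard C sketch uses two plain functions as stand-ins for operator [] and operator []=. The type IntVec and the function names are hypothetical. Only the store path counts writes, something a single indexing operator could not tell apart from a read.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_SIZE 8

typedef struct tagIntVec {
    int data[VEC_SIZE];
    int writes;            /* number of successful stores */
} IntVec;

/* Stand-in for operator []: read access with bounds checking.
   Returns a pointer to the element, or NULL if i is out of range. */
const int *vec_index(const IntVec *v, int i)
{
    assert(v != NULL);
    if (i < 0 || i >= VEC_SIZE)
        return NULL;
    return &v->data[i];
}

/* Stand-in for operator []=: write access with bounds checking.
   Returns 1 on success, 0 if i is out of range. */
int vec_indexasgn(IntVec *v, int i, int value)
{
    assert(v != NULL);
    if (i < 0 || i >= VEC_SIZE)
        return 0;
    v->data[i] = value;
    v->writes++;           /* only stores are counted, never reads */
    return 1;
}
```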

Using Operator Overloading: An Example

To demonstrate how to use this, a simple implementation of the “<<” syntax of C++ for output is illustrated.

Description

The left shift operator will be redefined with each of the basic types. Start by defining a structure that will contain the necessary information for output.

typedef struct tagiostream {

FILE *f;

unsigned left:1; // Left-align values

unsigned right:1; // Right-align values; pad on the left

// with the fill character (default alignment).

unsigned dec:1; // Format numeric values in decimal

unsigned oct:1; // Format numeric values in base 8

unsigned hex:1; //Format numeric values as base 16 (hexadecimal).

unsigned showpoint:1; // Show decimal point and trailing zeros

// for floating-point values.

unsigned uppercase:1; //Display uppercase A through F

//for hexadecimal values and E for

// scientific notation.

unsigned showpos:1; // Show plus signs (+) for positive values.

unsigned scientific:1; // Display floating-point numbers

// in scientific format.

unsigned unitbuf:1; // Flush the stream after each insertion.

unsigned showbase:1; // Show the output base

unsigned char fill; // The stream's fill character.

unsigned char precision;

unsigned char width;

} iostream;

This structure allows fine control of the displayed output.

Start defining the different types of operator << for the different types of right hand values. The simplest is the character string.

iostream * operator<<(iostream * f,char *p)

{

int n = fprintf(f->f,"%s",p);

while (n < f->width) {

fputc(f->fill,f->f);

n++;

}

return f;

}

The operator calls fprintf with the string, then writes as many fill characters as are needed to reach the current width. Note that the return value is the iostream pointer received as input, so that several of these operators can be chained together.
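The same padding logic can be tried in standard C without the operator syntax. In this sketch, put_padded is a hypothetical function name, and the width and fill character are passed as plain parameters instead of being read from the iostream structure.

```c
#include <assert.h>
#include <stdio.h>

/* Writes s to fp, then appends fill characters until at least width
   characters have been written. Returns the total number of characters
   written, or -1 on an output error. */
int put_padded(FILE *fp, const char *s, int width, int fill)
{
    assert(fp != NULL && s != NULL);

    int n = fprintf(fp, "%s", s);   /* number of characters in s */
    if (n < 0)
        return -1;

    while (n < width) {             /* pad on the right */
        if (fputc(fill, fp) == EOF)
            return -1;
        n++;
    }
    return n;
}
```

A string longer than the width is written in full and receives no padding, which matches the behavior of the loop in the operator shown above.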

The other types are similar: For instance, integers:

iostream * operator <<(iostream *f,int i)

{

char buf[50] = ""; // Stays empty if no base flag is set

if (f->dec)

sprintf(buf,"%d",i);

else if (f->oct) {

if (f->showbase)

sprintf(buf,"0%o",i);

else

sprintf(buf,"%o",i);

}

else if (f->hex) {

if (f->showbase)

sprintf(buf,"0x%x",i);

else

sprintf(buf,"%x",i);

}

return f << buf;

}

Fill a buffer with the representation of the int selected by the stream flags, and then call the previously defined operator << for strings, which handles the width options.
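The base selection can be checked in isolation with a standard C sketch. The function format_int and the enumeration IntBase are hypothetical names introduced here. The sketch uses snprintf instead of sprintf so that a buffer that is too small is reported instead of overrun.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { BASE_DEC, BASE_OCT, BASE_HEX } IntBase;   /* hypothetical */

/* Formats i into buf in the requested base, optionally with the base
   prefix ("0" for octal, "0x" for hexadecimal). Returns the length of
   the text, or -1 if buf is too small or formatting fails. */
int format_int(char *buf, size_t size, int i, IntBase base, int showbase)
{
    assert(buf != NULL && size > 0);
    int n;

    switch (base) {
    case BASE_OCT:
        n = snprintf(buf, size, showbase ? "0%o" : "%o", (unsigned)i);
        break;
    case BASE_HEX:
        n = snprintf(buf, size, showbase ? "0x%x" : "%x", (unsigned)i);
        break;
    default:
        n = snprintf(buf, size, "%d", i);
        break;
    }
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For the value 12345 this gives 30071 in octal and 3039 in hexadecimal.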

Provide a function that takes an open file pointer as input and returns a new iostream. Allocate the new structure with the standard “malloc” function, set the default fields, and return it.

iostream *new_ios(FILE *f)

{

iostream *result = (iostream *)malloc(sizeof(iostream));

if (result == NULL) // Out of memory

return NULL;

memset(result,0,sizeof(iostream));

result->f = f;

result->precision = 6;

result->dec = 1;

result->fill = ' ';

return result;

}

A simple program can be used to test at this point:

int main(void)

{

iostream *out = new_ios(stdout);

int a = 7;

out << "Hello " << a << "\n";

close_ios(out);

}

This will produce the output:

Hello 7

If you want to change the settings (the output base, say), you must manipulate the iostream structure directly. A more elegant solution is to define a new structure that acts as a placeholder. An appearance of that structure in the output stream produces no new output, but changes the settings of the underlying iostream structure. Define the new structure as follows:

typedef enum {eflush,edec,eoct,ehex} controlsequences;

typedef struct _control {

controlsequences val;

} Control;

Control flush = {eflush};

Control dec = {edec};

Control oct = {eoct};

Control hex = {ehex};

Define a structure that contains just one field: a member of the enumeration of possible actions to perform. With this defined, you only have to add a new operator<< to assign meaning to those structures embedded in the output stream:

iostream * operator<<(iostream *f,Control e)

{

switch (e.val) {

case eflush:

fflush(f->f);

break;

case edec:

f->dec = 1;

f->oct = 0;

f->hex = 0;

break;

case eoct:

f->dec = 0;

f->oct = 1;

f->hex = 0;

break;

case ehex:

f->dec = 0;

f->oct = 0;

f->hex = 1;

break;

}

return f;

}

No output is performed, but the settings of the stream are changed. Use the following program to test this:

int main(void)

{

iostream *out = new_ios(stdout);

int a = 12345;

out << "Dec: " << a << oct << " Octal " << a <<

" Hex: " << hex << a << "\n";

close_ios(out);

}

This will produce the output:

Dec: 12345 Octal 30071 Hex: 3039

See the iostream library documentation for a more detailed description of this feature and the source code of iostream.c in \lcc\src\iostream.

Another example of operator overloading is in the doubledouble directory, where a new numeric type (doubledouble) is introduced, which replaces the normal floating point data type and has extended precision.

[1] In lcc-win32 you can use pointers or references, and the ‘const’ attribute is not required. You can modify the arguments to a plus operator, but that could cause problems.

[2] Note that in C++ the operator [ ] can only be redefined at the class level. Since there are no classes here, this restriction does NOT apply.

[3] Note that this does not work in C++ either. Lengthy explanations are required to dissuade users from redefining these operators; the best one is in Scott Meyers’s book “More Effective C++”, see “Item 7: Never overload &&, ||, or ,”. Personally, I do not understand why things should be present in a language only to then have to write lengthy explanations on why these features are not useful or are risky. A cleaner design would omit those features.

[4] In a (rather lively) discussion in the comp.std.c newsgroup, jepler epler pointed out that in Python:

“The expression i = x[j] will call the method x.__getitem__(j) and assign the returned value to i. The expression x[j] = i will call the method x.__setitem__(j, i)

I make the first out as being analogous to operator [], and the second as the proposed operator []=.”